arima

Ejercicio Arima

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.1.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(fpp3)
Registered S3 method overwritten by 'tsibble':
  method               from 
  as_tibble.grouped_df dplyr
── Attaching packages ──────────────────────────────────────────── fpp3 1.0.1 ──
✔ tsibble     1.1.6     ✔ feasts      0.4.1
✔ tsibbledata 0.4.1     ✔ fable       0.4.1
── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
✖ lubridate::date()    masks base::date()
✖ dplyr::filter()      masks stats::filter()
✖ tsibble::intersect() masks base::intersect()
✖ tsibble::interval()  masks lubridate::interval()
✖ dplyr::lag()         masks stats::lag()
✖ tsibble::setdiff()   masks base::setdiff()
✖ tsibble::union()     masks base::union()
library(tidyquant)
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 
── Attaching core tidyquant packages ─────────────────────── tidyquant 1.0.11 ──
✔ PerformanceAnalytics 2.0.8      ✔ TTR                  0.24.4
✔ quantmod             0.4.28     ✔ xts                  0.14.1── Conflicts ────────────────────────────────────────── tidyquant_conflicts() ──
✖ zoo::as.Date()                 masks base::as.Date()
✖ zoo::as.Date.numeric()         masks base::as.Date.numeric()
✖ dplyr::filter()                masks stats::filter()
✖ xts::first()                   masks dplyr::first()
✖ zoo::index()                   masks tsibble::index()
✖ tsibble::interval()            masks lubridate::interval()
✖ dplyr::lag()                   masks stats::lag()
✖ xts::last()                    masks dplyr::last()
✖ PerformanceAnalytics::legend() masks graphics::legend()
✖ quantmod::summary()            masks base::summary()
✖ tidyquant::VAR()               masks fable::VAR()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Attaching package: 'tidyquant'

The following object is masked from 'package:fable':

    VAR
library(plotly)

Attaching package: 'plotly'

The following object is masked from 'package:ggplot2':

    last_plot

The following object is masked from 'package:stats':

    filter

The following object is masked from 'package:graphics':

    layout
library(ggplot2)
mex <-global_economy |> 
  filter(Country == "Mexico")

mex |> 
  autoplot(GDP)

Verificar si la serie es estacionaria

mex |> 
  features(log(GDP), unitroot_ndiffs)
# A tibble: 1 × 2
  Country ndiffs
  <fct>    <int>
1 Mexico       1
mex |> 
  autoplot(log(GDP) |> difference())

mex |> 
  features(log(GDP) |> difference(), unitroot_kpss)
# A tibble: 1 × 3
  Country kpss_stat kpss_pvalue
  <fct>       <dbl>       <dbl>
1 Mexico      0.289         0.1
alpha <- 0.05

0.1 < alpha #rechazar hipótesis nula si true
[1] FALSE

No se rechaza la \(H_0\), podemos asumir que es estacionaria.

Gráficos ACF & PACF

mex |> 
  gg_tsdisplay(log(GDP) |> difference(), plot_type = 'partial')

Se muestran picos negativos significativos en el rezago 1 en ambos gráficos. Se sugiere un ARIMA(0,1,1) porque la serie diferenciada ya es estacionaria y la ACF muestra un corte claro en el rezago 1, mientras que la PACF se desvanece gradualmente, patrón típico de un proceso MA(1).

Modelos Arima

gdp_fit <- mex |>
  model(
    arima_011 = ARIMA(log(GDP) ~ pdq(0,1,1)),
    arima_111 = ARIMA(log(GDP) ~ pdq(1,1,1)),
    arima_112 = ARIMA(log(GDP) ~ pdq(1,1,2)),
    arima_210 = ARIMA(log(GDP) ~ pdq(2,1,0)),
    arima_310 = ARIMA(log(GDP) ~ pdq(3,1,0))
  )

gdp_fit |> glance() |> 
  arrange(AICc)
# A tibble: 5 × 9
  Country .model    sigma2 log_lik   AIC  AICc   BIC ar_roots  ma_roots 
  <fct>   <chr>      <dbl>   <dbl> <dbl> <dbl> <dbl> <list>    <list>   
1 Mexico  arima_210 0.0201    31.9 -55.8 -55.0 -47.6 <cpl [2]> <cpl [0]>
2 Mexico  arima_011 0.0210    30.1 -54.3 -53.8 -48.2 <cpl [0]> <cpl [1]>
3 Mexico  arima_310 0.0202    32.3 -54.6 -53.4 -44.4 <cpl [3]> <cpl [0]>
4 Mexico  arima_112 0.0203    32.2 -54.3 -53.2 -44.1 <cpl [1]> <cpl [2]>
5 Mexico  arima_111 0.0209    30.8 -53.7 -52.9 -45.5 <cpl [1]> <cpl [1]>
gdp_fit |> 
  accuracy()
# A tibble: 5 × 11
  Country .model .type       ME    RMSE     MAE    MPE  MAPE  MASE RMSSE    ACF1
  <fct>   <chr>  <chr>    <dbl>   <dbl>   <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
1 Mexico  arima… Trai… -1.12e10 7.07e10 3.90e10 -1.02   9.70 0.793  1.03 -0.189 
2 Mexico  arima… Trai… -1.20e10 7.07e10 3.84e10 -0.985  9.63 0.780  1.03 -0.149 
3 Mexico  arima… Trai… -1.55e10 7.26e10 4.02e10 -0.879  9.59 0.818  1.06 -0.0483
4 Mexico  arima… Trai… -1.48e10 7.18e10 4.05e10 -0.939  9.79 0.824  1.04 -0.0277
5 Mexico  arima… Trai… -1.38e10 7.14e10 3.97e10 -0.928  9.70 0.807  1.04 -0.0816
gdp_fit |> 
  select(arima_210) |>
  report()
Series: GDP 
Model: ARIMA(2,1,0) w/ drift 
Transformation: log(GDP) 

Coefficients:
         ar1      ar2  constant
      0.2964  -0.3626    0.0849
s.e.  0.1224   0.1216    0.0184

sigma^2 estimated as 0.02007:  log likelihood=31.91
AIC=-55.82   AICc=-55.05   BIC=-47.64
gdp_fit |> 
  select(arima_111) |>
  report()
Series: GDP 
Model: ARIMA(1,1,1) w/ drift 
Transformation: log(GDP) 

Coefficients:
          ar1     ma1  constant
      -0.2516  0.6465    0.0985
s.e.   0.2203  0.1614    0.0305

sigma^2 estimated as 0.02086:  log likelihood=30.85
AIC=-53.7   AICc=-52.93   BIC=-45.52
gdp_fit |> 
  select(arima_011) |>
  report()
Series: GDP 
Model: ARIMA(0,1,1) w/ drift 
Transformation: log(GDP) 

Coefficients:
         ma1  constant
      0.4818    0.0788
s.e.  0.1280    0.0278

sigma^2 estimated as 0.02098:  log likelihood=30.15
AIC=-54.3   AICc=-53.84   BIC=-48.17

De acuerdo con los criterios de información \(AIC_c\) y \(BIC\) el mejor modelo es el arima_210 es el óptimo para este modelo, sin embargo la métrica \(MAE\) sugiere que el mejor modelo es el arima_111 y el RMSE el arima_011, esto indica que estos modelos también son competitivos por su precisión.

gdp_forecasts <- gdp_fit |> forecast(h = "10 years")

p <- gdp_forecasts |> autoplot(mex) +
  labs(title = "Pronósticos de modelos ARIMA seleccionados",
       x = "Año", y = "PIB (diferencia logarítmica)") +
    theme_minimal()

plotly::ggplotly(p)
Warning in geom2trace.default(dots[[1L]][[5L]], dots[[2L]][[1L]], dots[[3L]][[1L]]): geom_() has yet to be implemented in plotly.
  If you'd like to see this geom implemented,
  Please open an issue with your example code at
  https://github.com/ropensci/plotly/issues
Warning in geom2trace.default(dots[[1L]][[5L]], dots[[2L]][[1L]], dots[[3L]][[1L]]): geom_() has yet to be implemented in plotly.
  If you'd like to see this geom implemented,
  Please open an issue with your example code at
  https://github.com/ropensci/plotly/issues
Warning in geom2trace.default(dots[[1L]][[5L]], dots[[2L]][[1L]], dots[[3L]][[1L]]): geom_() has yet to be implemented in plotly.
  If you'd like to see this geom implemented,
  Please open an issue with your example code at
  https://github.com/ropensci/plotly/issues
Warning in geom2trace.default(dots[[1L]][[5L]], dots[[2L]][[1L]], dots[[3L]][[1L]]): geom_() has yet to be implemented in plotly.
  If you'd like to see this geom implemented,
  Please open an issue with your example code at
  https://github.com/ropensci/plotly/issues
Warning in geom2trace.default(dots[[1L]][[5L]], dots[[2L]][[1L]], dots[[3L]][[1L]]): geom_() has yet to be implemented in plotly.
  If you'd like to see this geom implemented,
  Please open an issue with your example code at
  https://github.com/ropensci/plotly/issues